정보과학회논문지 (Journal of KIISE)

Korean Title: 신조어 및 띄어쓰기 오류에 강인한 시퀀스-투-시퀀스 기반 한국어 형태소 분석기
English Title: Korean Morphological Analyzer for Neologism and Spacing Error based on Sequence-to-Sequence
Authors: 최병서, 이익훈, 이상구 (Byeongseo Choe, Ig-hoon Lee, Sang-goo Lee)
Citation: Vol. 47, No. 1, pp. 70-77 (2020. 01)
Korean Abstract (translated)
To perform morphological analysis on Internet text data collected from Korean online communities and similar sources, an analyzer must produce accurate results even for sentences with spacing errors, and its original-form restoration must be adequate for out-of-vocabulary input such as neologisms. However, existing Korean morphological analyzers often rely on dictionaries or rule-based algorithms for original-form restoration. This paper proposes a Korean morphological analyzer model, based on the sequence-to-sequence model, that effectively handles both the spacing problem and the neologism problem. The model does not use a dictionary and minimizes rule-based preprocessing. In addition to the commonly used syllables, it uses syllable bigrams and graphemes as input features, and it also uses data with spaces removed as training data. In experiments on the Sejong corpus, the proposed model outperformed existing morphological analyzers that do not use a dictionary. High performance was also confirmed on a dataset without spacing and on a dataset collected directly from the Internet.
English Abstract
Analyzing Internet text data from Korean online communities requires accurate morphological analysis even for sentences with spacing errors, as well as adequate restoration of the original form for out-of-vocabulary (OOV) input. However, existing Korean morphological analyzers often rely on dictionaries and complicated preprocessing for this restoration. In this paper, we propose a Korean morphological analyzer based on the sequence-to-sequence model that can effectively handle both the spacing problem and the OOV problem. In addition, the model uses syllable bigrams and graphemes as additional input features. The proposed model does not use a dictionary and minimizes rule-based preprocessing. In experiments on the Sejong corpus, the proposed model outperformed other morphological analyzers that do not use a dictionary. It also performed better on a dataset without spaces and on a sample dataset collected from the Internet.
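As a rough illustration of the input features named in the abstract (syllables, syllable bigrams, and graphemes), the Python sketch below derives all three from a raw string. It is not the authors' implementation: the function names, the `<s>` padding symbol, and the `strip_spaces` option are assumptions made for this example. The grapheme split relies on the standard Unicode arithmetic for precomposed Hangul syllables.

```python
from __future__ import annotations

# Illustrative sketch only; names and the padding symbol are hypothetical.
HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable

# Conjoining jamo tables from the Unicode Hangul Jamo block.
CHOSEONG = [chr(0x1100 + i) for i in range(19)]          # leading consonants
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]         # vowels
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # optional trailing consonants


def to_graphemes(syllable: str) -> list[str]:
    """Split one precomposed Hangul syllable into jamo; return other characters unchanged."""
    code = ord(syllable)
    if not (HANGUL_BASE <= code <= HANGUL_LAST):
        return [syllable]
    lead, rest = divmod(code - HANGUL_BASE, 21 * 28)
    vowel, tail = divmod(rest, 28)
    jamo = [CHOSEONG[lead], JUNGSEONG[vowel]]
    if tail:  # tail index 0 means the syllable has no final consonant
        jamo.append(JONGSEONG[tail])
    return jamo


def syllable_bigrams(text: str, pad: str = "<s>") -> list[tuple[str, str]]:
    """Pair each character with its predecessor; `pad` marks the sentence start."""
    chars = list(text)
    return list(zip([pad] + chars[:-1], chars))


def extract_features(text: str, strip_spaces: bool = False) -> list[dict]:
    """Build one feature record per character, optionally after removing spaces."""
    if strip_spaces:
        text = text.replace(" ", "")
    return [
        {"syllable": ch, "bigram": bg, "graphemes": to_graphemes(ch)}
        for ch, bg in zip(text, syllable_bigrams(text))
    ]
```

Calling `extract_features("형태소 분석", strip_spaces=True)` returns one record per syllable with the space removed, which approximates the space-free input the abstract mentions. How the actual model embeds and combines these features is not stated in the abstract.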
Keywords: morphological analysis, POS tagging, sequence-to-sequence, original form recovery, internet text data